This grant, which provided one half of the salary of a
post-doctoral physicist, was aimed at exploring the usefulness of the
Massively Parallel Processor (MPP) at NASA Goddard Space Flight 
Center for investigation of electronic structures and hyperfine 
properties of atomic and condensed matter systems. Our major 
effort during this period was directed towards the preparation of 
algorithms for parallelization of the computational procedure we 
have been using on serial computers for electronic structure 
calculations in condensed matter systems. Before getting 
involved in the more complicated Hartree-Fock procedure, it was 
necessary for us, as outlined in the proposal, to gain experience 
in parallel computation using a simpler, approximate method,
the Self-Consistent Charge Extended Huckel (SCCEH) method, and
to benchmark the performance of the MPP for our purposes
against the high-speed serial computer UNIVAC 1100/92 we have
been using for our work over the past few years.

About four months of the grant period was spent in 
familiarizing ourselves with the language and operation of the 
MPP. The MPP was accessed from our University through the 
TELENET and SPAN networks. In the remainder of the period of the 
grant, one of our main accomplishments was the adaptation of a 
major component of the SCCEH method, namely, the evaluation of 
the matrix elements of the Hamiltonian used in the variational
procedure employed to determine the electronic wave-functions of
large molecules and clusters of atoms. This experience helped us
in developing a general algorithm for calculating the two-center
one-electron integrals that occur in the Hartree-Fock procedure,
which we are interested in carrying out on the MPP. Lastly, as
described in the proposal, our many-body atomic investigations 
require fast evaluation of a very large number of two-electron 
integrals of the Coulomb and exchange types involved in the
matrix elements of the electron-electron interaction over ground
and excited state wave-functions. These integrals are usually
evaluated by standard numerical quadrature procedures.
Towards the end of the grant period, we explored
the adaptation of Gaussian and Laguerre quadrature procedures to 
the MPP for calculations of such integrals. 

Detailed descriptions of these investigations and results
are given in the attached Appendix. These results were
presented on 24 April 1987 before an internal panel of the Space
Computing and Image Analysis Division of the NASA Goddard Space 
Flight Center consisting of Dr. Milt Halem, Mr. James Fischer and 
associates. An article based on these results will be prepared 
in the future after some additional investigations, which 
unfortunately have been slowed down by the non-availability of 
support for personnel to work on this project. 
APPENDIX


The work reported in this Appendix represents the joint 
efforts of Dr. N. Sahoo (post-doctoral research associate), 
Department of Physics, SUNY Albany, Dr. S.N. Ray of Systems and 
Applied Science Corporation, Lanham, Maryland and Professor T.P. 
Das, Department of Physics, SUNY Albany. 


A. MPP ADAPTATION OF THE SELF-CONSISTENT CHARGE EXTENDED
HUCKEL (SCCEH) PROCEDURE


(I) The SCCEH Procedure 


The Self-Consistent-Charge Extended Huckel (SCCEH)
procedure [1] is a semi-empirical method to determine the
electronic energy levels and wave-functions of a molecule. This
approach is based on the valence shell approximation where the 
one-electron Hamiltonian is defined only through its matrix 
elements in the atomic valence shell orbital basis. These matrix 
elements are not evaluated through explicit treatment of 
electron-electron interaction but by relating them to 
experimentally measured ionization energies of the pertinent 
atoms and ions. 

In this method the molecular orbitals (MO) are expressed as a
linear combination of atomic orbitals (AO) in the form

    \psi_\mu = \sum_i C_{\mu i} \, \chi_i                             (1)

in common with other LCAO procedures, the coefficients $C_{\mu i}$
representing MO coefficients which are treated as variational
parameters for calculating the minimum energy configuration of
the molecule. The values of $C_{\mu i}$ in the LCAOMO procedure for
which the total energy is minimum are obtained by solving the
secular equation:

    \sum_j ( H_{ij} - \epsilon_\mu S_{ij} ) \, C_{\mu j} = 0           (2)

where $H_{ij}$ and $S_{ij}$ in eq. (2) represent the Hamiltonian and
overlap matrix elements respectively. The major computational
steps involved in solving equation (2) for $C_{\mu i}$ (which in turn
determines the $\psi_\mu$ from which all the electronic properties are
calculated) and $\epsilon_\mu$ (the one-electron energy levels) for the
SCCEH procedure are the following.


1. Determination of $H_{ij}$ empirically from the orbital
ionization energies $E_i$, determined from atomic data, and the
charges on the atoms. The charge $q_\ell$ on any atom $\ell$ is
obtained from the coefficients $C_{\mu i}$ by using the Mulliken
approximation

    q_\ell = Z_\ell - \sum_{i \in \ell} \sum_j \sum_\mu
             ( n_\mu^\alpha + n_\mu^\beta ) \, C_{\mu i} C_{\mu j} S_{ij}   (3)

where $\alpha$ and $\beta$ represent spin states, $n_\mu^\alpha$ and
$n_\mu^\beta$ the populations of the $\mu$th occupied orbital for the
two different spin states, and $Z_\ell$ represents the valence
charge on atom $\ell$.
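As a minimal sketch of how the Mulliken charges of Eq. (3) could be evaluated, the Python fragment below assumes the standard Mulliken gross-population form, in which each orbital's population is shared among atoms through the products $C_{\mu i} C_{\mu j} S_{ij}$. The function name `mulliken_charges`, the `atom_of` map from orbital to atom, and the two-orbital toy numbers are illustrative assumptions and are not taken from the original SCCEH code.

```python
def mulliken_charges(C, S, occ, atom_of, Z):
    """q_l = Z_l - sum_{i on l} sum_j sum_mu occ[mu] * C[mu][i] * C[mu][j] * S[i][j].

    C[mu][i] : MO coefficient of atomic orbital i in molecular orbital mu
    S[i][j]  : overlap matrix
    occ[mu]  : total (alpha + beta) population of MO mu
    atom_of  : atom index of each atomic orbital (hypothetical helper map)
    Z[l]     : valence charge of atom l
    """
    nbasis = len(S)
    q = list(Z)  # start from the valence charges
    for mu, n_mu in enumerate(occ):
        for i in range(nbasis):
            gross = sum(C[mu][i] * C[mu][j] * S[i][j] for j in range(nbasis))
            q[atom_of[i]] -= n_mu * gross
    return q


# Toy example: one doubly occupied bonding MO built from two equivalent
# orbitals with overlap s (illustrative numbers only).
s = 0.5
norm = (2.0 * (1.0 + s)) ** -0.5
S = [[1.0, s], [s, 1.0]]
C = [[norm, norm]]
charges = mulliken_charges(C, S, occ=[2.0], atom_of=[0, 1], Z=[1.0, 1.0])
print(charges)  # both atoms come out neutral up to rounding
```

By symmetry each atom receives one of the two electrons, so both net charges vanish, which gives a quick sanity check on the index conventions.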

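The $q_\ell$ are incorporated in $H$ through the equations

    [Equation (4), relating the matrix elements of $H$ to the atomic
    charges and the orbital ionization energies, is not legible in
    the source.]                                                      (4)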

2. The determination of $S_{ij}$:

For the determination of $H_{ij}$ one needs to evaluate $S_{ij}$, the
overlap integral between atomic orbitals, which is given by

    S_{ij} = \int \chi_i^{*} \, \chi_j \, d\tau                        (5)


The atomic orbitals used in equations (1) and (5) are usually
taken as Slater type orbitals (STO) but could equally well be
taken as Gaussian type orbitals [4] (GTO). The expression for an
STO is given by

    \chi_i = N_i \, r^{n_i - 1} e^{-\zeta_i r} \, Y_{l_i m_i}(\theta, \phi)   (6)

and that for a GTO is given by

    \chi_i = N_i \, x^{a_i} y^{b_i} z^{c_i} e^{-\zeta_i r^2}              (7)

with $a_i + b_i + c_i$ = the angular momentum of the atomic orbital.
In equations (6) and (7), the exponents $\zeta_i$ are chosen from
analysis of the nature and features of the wave-functions of the
atoms involved in the molecular system studied.


The overlap integrals are the simplest two-center integrals
encountered in molecular electronic structure calculations.
Since the integrals for different pairs of atomic orbitals are
independent of one another, they can be computed in parallel.

(1) Overlap integrals over STOs:

The detailed formalism for numerical evaluation of overlap
integrals for Slater type orbitals is documented in the
literature [5]. After careful examination of the different formulae
involved in the evaluation of such integrals, we decided on a
particular area where we can use the MPP very effectively. The
expression for the two-center overlap integral can be
algebraically reduced to expressions involving one or more basic
two-center integrals, known as reduced overlap integrals. The
reduced overlap integrals are finally expressed in terms of the
auxiliary functions $A_k(\rho)$ and $B_k(X)$, which are evaluated
using the following recurrence relations:

    \rho \, A_k(\rho) = k \, A_{k-1}(\rho) + e^{-\rho}                 (8)

where

    \rho = (\zeta_A + \zeta_B) R / 2                                  (9)

and

    B_k(X) = (-1)^{k+1} A_k(-X) - A_k(X)                              (10)

and

    X = (\zeta_A - \zeta_B) R / 2                                     (11)

The expressions for $A_k(\rho)$ and $B_k(X)$ are given by

    A_k(\rho) = \int_{1}^{\infty} x^k e^{-\rho x} \, dx                (12)

    B_k(X) = \int_{-1}^{1} x^k e^{-X x} \, dx                          (13)

In the above expressions R is the separation between atoms A and
B, and $\zeta_A$ and $\zeta_B$ are the Slater exponents of any atomic
orbital centered on atom A and B respectively.

The expressions (8-13) can be easily evaluated on the MPP,
with considerable saving of computational time compared to a
serial processor, by using the following algorithm.

For 128 basis functions involving different $\zeta$,
irrespective of their angular momentum symmetry, $\rho$ and $X$ can
each be constructed as a 128 x 128 matrix. Thus $A_0(\rho)$ and
$B_0(X)$ can be evaluated for 16,384 $\rho$ or $X$ values
simultaneously, or in parallel, using the MPP. Once $A_0(\rho)$ and
$B_0(X)$ are evaluated, the $A_k(\rho)$ and $B_k(X)$ for $k \neq 0$ can
be obtained using the recursion relations (8) and (10), again
simultaneously for all 16,384 values of $\rho$ and $X$. We have made
substantial progress in programming this part in Parallel Pascal
for the MPP, and the procedure will be called from the main FORTRAN
program residing in host VAX memory. We have not done any
benchmarking yet for these integrals, but from our experience
with overlap integrals using Gaussian orbitals, described next,
we expect the computation on the MPP to be much faster than on
the serial processor. Our main motivation behind the analysis of
overlap integrals involving Slater orbitals was to gain
experience in redesigning parallel algorithms for recursive
numerical procedures similar to the ones described above.
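The recurrences above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the textbook forms $A_0(x) = e^{-x}/x$, $x A_k(x) = k A_{k-1}(x) + e^{-x}$ and $B_k(X) = (-1)^{k+1} A_k(-X) - A_k(X)$, a single separation `r` for all pairs, and nested lists standing in for the 128 x 128 parallel arrays. The function names are hypothetical and the code is not the Parallel Pascal routine described in the text.

```python
import math


def a_values(x, kmax):
    """A_0..A_kmax at one nonzero argument x, by upward recurrence."""
    e = math.exp(-x)
    vals = [e / x]                          # A_0(x) = exp(-x) / x
    for k in range(1, kmax + 1):
        vals.append((k * vals[-1] + e) / x)
    return vals


def b_values(x, kmax):
    """B_0..B_kmax from the A functions; x = 0 is treated separately."""
    if x == 0.0:
        # B_k(0) is the integral of t**k over [-1, 1].
        return [2.0 / (k + 1) if k % 2 == 0 else 0.0 for k in range(kmax + 1)]
    a_pos, a_neg = a_values(x, kmax), a_values(-x, kmax)
    return [(-1) ** (k + 1) * a_neg[k] - a_pos[k] for k in range(kmax + 1)]


def auxiliary_tables(zeta, r, kmax):
    """A_k and B_k for every pair of exponents (one table entry per pair)."""
    n = len(zeta)
    a = [[a_values(0.5 * (zeta[i] + zeta[j]) * r, kmax) for j in range(n)]
         for i in range(n)]
    b = [[b_values(0.5 * (zeta[i] - zeta[j]) * r, kmax) for j in range(n)]
         for i in range(n)]
    return a, b


a, b = auxiliary_tables([1.5, 0.5], 2.0, kmax=2)
print(a[0][1][0], b[0][1][1])
```

On a parallel array machine each recurrence step would be applied to all table entries at once, so the cost grows with the highest k needed rather than with the number of exponent pairs. The subtraction in the B relation loses accuracy when $|X|$ is small, so a production code would need a series expansion there.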

3. Overlap Integrals over GTOs:

The MPP can be utilized more effectively for evaluation of 
molecular integrals in general and overlap integrals in 
particular when one uses Gaussian type of atomic basis functions. 
In this case, all the molecular integrals can be analytically 
expressed in terms of some basic integrals, which can be 
simultaneously evaluated using the MPP. We will describe first 
the evaluation of overlap integrals between Gaussian functions 
and in the following section discuss the generalization of this 
procedure for other one electron two center integrals. 

The overlap integrals between two Gaussian type functions
$G_A$ and $G_B$, namely,

    S_{AB} = \int_{-\infty}^{+\infty} G_A \, G_B \, d\tau              (14)

of any symmetry can be expressed in terms of the overlap integral
between two s-functions, $\langle s_A | s_B \rangle$. For example, the
overlap integrals between $p_i$ and $s$ orbitals on two centers, and
between $p_i$ and $p_j$ orbitals on two adjacent centers A and B, can
be written in terms of $\langle s_A | s_B \rangle$:

    [Equation (15), giving $\langle p_i | s \rangle$ and
    $\langle p_i | p_j \rangle$ in terms of $\langle s_A | s_B \rangle$,
    is not legible in the source.]                                    (15)

In the above expressions,

    \langle s_A | s_B \rangle = N_A N_B \, (\pi / \gamma)^{3/2} \, K ,
    K = \exp[ -\zeta_A \zeta_B (AB)^2 / (\zeta_A + \zeta_B) ]          (16)

where $(AB)^2$ is the square of the distance between the two centers
A and B, $\zeta_A$ and $\zeta_B$ are the exponents in the Gaussian
functions $G_A$ and $G_B$, $N_A$ and $N_B$ are the normalization
constants, $\gamma = \zeta_A + \zeta_B$, and $(AB)_i = (A - B)_i$.

The $\langle s_A | s_B \rangle$ integrals are evaluated on the MPP
using the following steps.

(a) evaluate the distance matrix from which $(AB)^2$ is calculated.

(b) evaluate the $\zeta_A \zeta_B$ matrix

(c) evaluate the $\zeta_A + \zeta_B$ matrix

(d) evaluate the $N_A N_B$ matrix

(e) evaluate the K matrix

Since the MPP can do arithmetic operations involving 128X128 
parallel arrays, the Gaussian functions are divided into blocks 
of 128 Gaussians and all the parallel arrays or matrices involved 
in the steps a-e have 128X128 array dimension. 
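Steps (a) to (e) can be sketched as elementwise operations on whole matrices. The Python fragment below is a minimal illustration, assuming normalized s-type Gaussians $N e^{-\zeta r^2}$ with $N = (2\zeta/\pi)^{3/4}$ and the standard closed form for their overlap; the function name and the nested-list representation of the parallel arrays are assumptions made for this example.

```python
import math


def ss_overlap_matrix(centers, zeta):
    """Overlap matrix between normalized s-type Gaussians, built step by step."""
    n = len(zeta)
    idx = range(n)
    norm = [(2.0 * z / math.pi) ** 0.75 for z in zeta]

    # (a) squared distance between every pair of centers
    ab2 = [[sum((p - q) ** 2 for p, q in zip(centers[i], centers[j]))
            for j in idx] for i in idx]
    # (b) products of exponents
    prod = [[zeta[i] * zeta[j] for j in idx] for i in idx]
    # (c) sums of exponents
    gamma = [[zeta[i] + zeta[j] for j in idx] for i in idx]
    # (d) products of normalization constants
    nn = [[norm[i] * norm[j] for j in idx] for i in idx]
    # (e) exponential factor K
    k = [[math.exp(-prod[i][j] * ab2[i][j] / gamma[i][j]) for j in idx]
         for i in idx]

    return [[nn[i][j] * (math.pi / gamma[i][j]) ** 1.5 * k[i][j] for j in idx]
            for i in idx]


s = ss_overlap_matrix([(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)], [1.0, 1.0])
print(s[0][0], s[0][1])
```

Each intermediate matrix depends only on elementwise arithmetic, which is why every step maps onto a single operation over a 128X128 parallel array.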

As an example, we will describe the algorithm for the evaluation
of the distance matrix for 128 points. For this purpose one
first constructs three 128X128 parallel arrays with each column
containing the x, y or z coordinate of a single point, e.g.

    [C]_{nl} = x_l ,   [D]_{nl} = y_l ,   [E]_{nl} = z_l               (17)

for n = 1, 128 and l = 1, 128. For instance:

          | x_1   x_2   ...   x_128 |
    [C] = | x_1   x_2   ...   x_128 |                                 (18)
          |  .     .           .    |
          | x_1   x_2   ...   x_128 |

with 128 rows. One then calculates the transposes of the matrices
[C], [D], and [E], from which

    [X] = [C] - [C]^T ,   [Y] = [D] - [D]^T ,   [Z] = [E] - [E]^T      (19)

are obtained. One can easily see that

    [X]_{nl} = x_l - x_n ,  [Y]_{nl} = y_l - y_n ,  [Z]_{nl} = z_l - z_n   (20)

Therefore the distance matrix for different centers can be
obtained from

    (AB)^2_{nl} = [X]_{nl}^2 + [Y]_{nl}^2 + [Z]_{nl}^2                 (21)
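The construction just described (a broadcast array, its transpose, and elementwise differences and squares) can be mimicked in Python. This is a sketch only: nested lists stand in for the parallel arrays, and the helper names are hypothetical.

```python
def column_broadcast(values):
    """Array whose every row is the full list, so element [n][l] = values[l]."""
    return [list(values) for _ in values]


def transpose(m):
    return [list(col) for col in zip(*m)]


def difference(values):
    """[C] - [C]^T: element [n][l] = values[l] - values[n]."""
    c = column_broadcast(values)
    ct = transpose(c)
    return [[a - b for a, b in zip(row, row_t)] for row, row_t in zip(c, ct)]


def squared_distance_matrix(points):
    """Squared distance between every pair of points."""
    xs, ys, zs = zip(*points)
    dx, dy, dz = difference(xs), difference(ys), difference(zs)
    n = len(points)
    return [[dx[i][j] ** 2 + dy[i][j] ** 2 + dz[i][j] ** 2 for j in range(n)]
            for i in range(n)]


pts = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
d2 = squared_distance_matrix(pts)
print(d2[0][1], d2[1][2])
```

The whole matrix is produced by three subtractions, three squarings and two additions on full arrays, independent of how many points are involved up to the array size.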


In contrast to serial computation, where the arithmetic operations
are carried out on one data item at a time, like $x_i - x_j$, here
we are evaluating the differences between the x-coordinates of all
the 128 centers simultaneously. This leads to considerable saving
in computational time. A similar procedure is followed in carrying
out the steps b to d. Once the matrices in (a), (b) and (c) have
been evaluated, the matrix (e) can be obtained directly using Eq.
(16). The Parallel Pascal programs for these steps have been
written and tested on the MPP. It is straightforward to
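calculate the overlap integrals $\langle p_i | s \rangle$ and
$\langle p_i | p_j \rangle$ from $\langle s_A | s_B \rangle$ using
the equation (15). In actual molecular calculations, the atomic
orbitals are expressed as linear combinations of Gaussian
functions,

    \chi_i = \sum_p d_{ip} \, G_p                                      (22)

so that the overlap integrals between atomic orbitals become

    S_{ij} = \sum_p \sum_q d_{ip} \, d_{jq} \, \langle G_p | G_q \rangle   (23)

Equation (23) can also be evaluated on the MPP by computing
$\langle G_p | G_q \rangle$, the overlap integrals between the
primitives, and then multiplying it with the $d$ matrices and
performing the appropriate contracted sum.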
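The contracted sum can be illustrated as two successive matrix products. The sketch below assumes the usual contraction $S_{ij} = \sum_p \sum_q d_{ip} d_{jq} \langle G_p | G_q \rangle$; the names `d`, `prim` and `contract_overlap` and the numbers are illustrative only.

```python
def contract_overlap(d, prim):
    """Contract a primitive overlap matrix with coefficient rows d[i][p]."""
    n_ao, n_prim = len(d), len(prim)
    # First half-transform: half[i][q] = sum_p d[i][p] * prim[p][q]
    half = [[sum(d[i][p] * prim[p][q] for p in range(n_prim))
             for q in range(n_prim)] for i in range(n_ao)]
    # Second half-transform: S[i][j] = sum_q half[i][q] * d[j][q]
    return [[sum(half[i][q] * d[j][q] for q in range(n_prim))
             for j in range(n_ao)] for i in range(n_ao)]


prim = [[1.0, 0.2], [0.2, 1.0]]   # overlaps between two primitives
d = [[1.0, 0.0], [0.5, 0.5]]      # two contracted orbitals
print(contract_overlap(d, prim))
```

Doing the sum in two stages keeps the work at two matrix products instead of a four-index loop, which suits an array processor.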


Our benchmarking has shown that these one-electron
two-center integrals can be computed 15 times faster on the
MPP than on the UNIVAC 1100/91.


4. The Iterative Process:

After the evaluation of $S_{ij}$ by an MPP resident routine, the
$H_{ij}$ are evaluated from some initial charge state of the
atoms and the secular equations (2) are solved to find the
$C_{\mu i}$ for the next cycle. The solution of equations (2)
requires the diagonalization of a large matrix, which may be done
faster on the MPP using a suitable parallel algorithm than with
the existing serial codes. We have been working on developing a
Parallel Pascal code for matrix diagonalization using available
parallel algorithms.

The MO coefficients $C_{\mu i}$ thus obtained are used to determine
$q_\ell$ from equation (3), the new $H_{ij}$ are constructed again
using equations (4), and the whole process is repeated until the
charge on each atom stabilizes within a chosen tolerance limit.

A host resident routine is used to carry out the iterative
process.

We have thus made considerable progress in adapting the
existing serial code for the SCCEH procedure to the MPP. The novel
architecture of the MPP is exploited wherever it is possible at
the present time.


(B) MPP ADAPTATION OF THE FIRST-PRINCIPLES HARTREE-FOCK 
CLUSTER PROCEDURE FOR ELECTRONIC STRUCTURES OF LARGE MOLECULES 
AND SOLID STATE SYSTEMS: 


Our ultimate aim is to adapt to the MPP the first-principles
Linear Combination of Atomic Orbitals-Molecular
Orbital (LCAOMO) Hartree-Fock procedure to study the electronic
structures of large molecules and atomic clusters used to
simulate solid state systems. Our experience in parallel
evaluation of overlap integrals using the MPP has helped us to
make some progress in this direction. The major computational
effort involved in this procedure is the evaluation of $F_{\mu\nu}$,
the matrix elements of the Hartree-Fock operator in the chosen
atomic basis set, which are given by


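    F_{\mu\nu} = \langle \chi_\mu | -\tfrac{1}{2} \nabla^2 | \chi_\nu \rangle
               + \langle \chi_\mu | V_N | \chi_\nu \rangle
               + \sum_j \langle \chi_\mu \psi_j | 1/r_{12} | \chi_\nu \psi_j \rangle
               - \sum_j \langle \chi_\mu \psi_j | 1/r_{12} | \psi_j \chi_\nu \rangle   (24)

(written in atomic units, with $V_N$ the electron-nuclear attraction
and $j$ running over the occupied orbitals $\psi_j$).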


In the above expression the first term is the kinetic energy
integral, the second term is the potential energy integral and
the third and fourth terms are the two-electron Coulomb and
exchange integrals. The kinetic energy (K.E.) integrals for
Gaussian type atomic orbitals can be expressed in terms of the
overlap integrals between s-type Gaussian functions. The
evaluation of such integrals on the MPP has already been
described in Appendix A. We are now developing a Parallel Pascal
code for the evaluation of these K.E. integrals from the overlap
integrals. We plan to develop Parallel Pascal codes for other
electronic integrals in the future.
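As an illustration of how a kinetic energy integral follows from an overlap integral, the sketch below treats only the simplest case of two normalized s-type Gaussians, in atomic units. It assumes the standard result $\langle s_A | -\tfrac{1}{2}\nabla^2 | s_B \rangle = g \, (3 - 2 g \, (AB)^2) \, \langle s_A | s_B \rangle$ with $g = \zeta_A \zeta_B / (\zeta_A + \zeta_B)$; the function names are hypothetical and the code is not the Parallel Pascal routine under development.

```python
import math


def ss_overlap(za, zb, ab2):
    """Overlap of two normalized s-type Gaussians with squared separation ab2."""
    na = (2.0 * za / math.pi) ** 0.75
    nb = (2.0 * zb / math.pi) ** 0.75
    gamma = za + zb
    return na * nb * (math.pi / gamma) ** 1.5 * math.exp(-za * zb * ab2 / gamma)


def ss_kinetic(za, zb, ab2):
    """Kinetic energy integral obtained by scaling the overlap integral."""
    g = za * zb / (za + zb)
    return g * (3.0 - 2.0 * g * ab2) * ss_overlap(za, zb, ab2)


# A single Gaussian with exponent zeta has kinetic energy 3 * zeta / 2.
print(ss_kinetic(0.8, 0.8, 0.0))
```

Because the scaling factor is built from the same exponent-product, exponent-sum and distance matrices used for the overlap, the kinetic energy matrix costs only a few extra elementwise array operations.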


The total energy of a cluster of atoms and ions used to
simulate the infinite solid state system is the sum of the
electronic energy and the energy of repulsion (and attraction
for opposite charges), $V_{NN}$, between the charges on the ions.
The electronic energy is obtained by solving a set of
Hartree-Fock equations, whereas $V_{NN}$ is calculated using the
expression,

    V_{NN} = \sum_{I < J} Z_I Z_J / | R_I - R_J |                      (25)

$Z_I$ and $R_I$ being respectively the charge and position vector of
the $I$th ion.


We have written a Parallel Pascal code for the MPP for this
purpose, following the algorithm described earlier for the
evaluation of the distance matrix. We found that this can be done
30 times faster than on the UNIVAC 1100/91 serial computer.
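A minimal Python sketch of the ion-ion energy, written in the same array style as the distance matrix algorithm, is given below. It assumes the usual Coulomb pair sum $V_{NN} = \sum_{I<J} Z_I Z_J / |R_I - R_J|$ in atomic units; the function name and the three-ion test geometry are illustrative.

```python
import math


def ion_ion_energy(charges, positions):
    """Sum of Z_I * Z_J / R_IJ over distinct pairs, built from full arrays."""
    n = len(charges)
    idx = range(n)
    dist = [[math.dist(positions[i], positions[j]) for j in idx] for i in idx]
    zz = [[charges[i] * charges[j] for j in idx] for i in idx]
    # Elementwise quotient, with the diagonal (self-interaction) masked out.
    terms = [[zz[i][j] / dist[i][j] if i != j else 0.0 for j in idx]
             for i in idx]
    # Every pair appears twice in the full array, hence the factor 1/2.
    return 0.5 * sum(sum(row) for row in terms)


ions = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0)]
print(ion_ion_energy([1.0, -1.0, 1.0], ions))
```

Working with the full symmetric array does twice the arithmetic of a pair loop, but on an array processor the whole quotient is a single parallel operation.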


C. MPP Adaptation of the Many-Body Procedure for Atomic Systems:

We have also been interested in adapting to the MPP the
relativistic many-body procedure for studying properties of
atomic systems. The most time-consuming computation in this
procedure is the evaluation of the matrix elements
of the electron-electron interaction involving three and four
excited state wave functions [10] of an atom, of the form:

    I_3 = \langle k_1 k_2 | 1/r_{12} | k_3 c \rangle ,
    I_4 = \langle k_1 k_2 | 1/r_{12} | k_3 k_4 \rangle                 (26)

In the above expression $k_1$, $k_2$, $k_3$ and $k_4$ refer to unoccupied
one-electron excited states and $c$ to one of the occupied
one-electron states for the many-electron Dirac-Hartree-Fock
Hamiltonian.



Usually these integrals are evaluated following standard
numerical procedures like Laguerre quadrature. To gain
experience in designing parallel algorithms
for numerical integrations, we have concentrated on writing MPP
Pascal codes for Laguerre quadrature involving simple functions.
This will help us to develop suitable parallel codes for evaluation
of $I_3$ and $I_4$ in equation (26). We will briefly describe the
algorithm for the simultaneous evaluation of

    I_j = \int_0^{\infty} g_j(x) \, dx                                 (27)

on the MPP, for 128 different functions $g_j$. The procedure can be
easily generalized also for $j \neq 128$. If one uses the Laguerre
integration formula

    \int_0^{\infty} e^{-x} f(x) \, dx \approx \sum_{i=1}^{n} w_i \, f(x_i)

then equation (27) can be written as

    I_j \approx \sum_{i=1}^{n} w_i \, e^{x_i} \, g_j(x_i)              (28)

where $x_i$ are the zeros of Laguerre polynomials and $w_i$ are the
weight factors. The number of terms n in the summation is chosen
depending on the desired accuracy. We will choose n=128 for the
sake of clarity, but the procedure can be used for a smaller
number of n values, as is normally done in this field. The $x_i$
and $w_i e^{x_i}$ are available from standard tables [12] and can be
stored on the MPP as data. Each column of the 128X128 array of the
MPP will contain the 128 $x_i$ and $w_i e^{x_i}$ values such that

    [C]_{in} = x_i   and   [D]_{in} = w_i \, e^{x_i}                   (29)


for n = 1 to 128, as in section 3 of Appendix A. In equations
(29), [C] and [D] are 128X128 parallel array data items.

If all the $g_j(x)$ in Eqs. (27) and (28) have the same
algebraic form, differing only by the values of some specific
parameters, one can evaluate the values of $g_j(x_i)$ at the 128
$x_i$ values using the MPP. If they have different forms, then one
can use the host VAX computer to evaluate them and transfer them
to the MPP. Let us assume that we have a parallel array [G],
whose elements are

    [G]_{ij} = g_j(x_i)                                               (30)

Now $[D]_{ij} * [G]_{ij}$ will give us the quantity
$w_i e^{x_i} g_j(x_i)$ in Eq. (28) for all the 128 functions at the
128 $x_i$ values. Thus

    [F]_{ij} = w_i \, e^{x_i} \, g_j(x_i)

for 128 values of i and j. From Eq. (28) it then follows that

    I_j = \sum_{i=1}^{128} [F]_{ij}                                    (31)
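The array formulation of the quadrature can be tried out on a small case. The Python sketch below uses the two-point Gauss-Laguerre rule (nodes $2 \mp \sqrt{2}$, weights $(2 \pm \sqrt{2})/4$) instead of 128 points, and the family $g_j(x) = x^j e^{-x}$ as an assumed example of functions sharing one algebraic form; nested lists stand in for the parallel arrays and all names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)
NODES = [2.0 - SQRT2, 2.0 + SQRT2]                     # zeros of L_2(x)
WEIGHTS = [(2.0 + SQRT2) / 4.0, (2.0 - SQRT2) / 4.0]   # matching weights


def g(j, x):
    """Example integrand family: g_j(x) = x**j * exp(-x)."""
    return x ** j * math.exp(-x)


def integrate_all(n_funcs):
    """Approximate the integral of g_j over [0, inf) for j = 0..n_funcs-1."""
    rows, cols = range(len(NODES)), range(n_funcs)
    # [D]: every column holds the scaled weights w_i * exp(x_i)
    d = [[WEIGHTS[i] * math.exp(NODES[i]) for _ in cols] for i in rows]
    # [G]: column j holds function j sampled at the nodes
    gm = [[g(j, NODES[i]) for j in cols] for i in rows]
    # [F] = [D] * [G], elementwise
    f = [[d[i][j] * gm[i][j] for j in cols] for i in rows]
    # Column sums give the integrals
    return [sum(f[i][j] for i in rows) for j in cols]


print(integrate_all(4))  # close to 0!, 1!, 2!, 3!
```

A two-point rule is exact for polynomial factors up to degree three, so the four results reproduce the factorials to rounding error; more nodes would be needed for less smooth integrands.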


This addition can be easily done by repeatedly using the
predeclared shift function in the MPP Pascal language [13] and
adding the shifted array to the parallel array on which the
shift operation is applied. Thus, starting from $[S]^0 = [F]$,
suppose

    [T]^k = shift([S]^{k-1}, n_k, 0)

where $n_k$ is chosen such that in each successive shift operation
we north shift the parallel array by twice the number of rows of
the previous shift operation, that is, $n_k = 2 n_{k-1}$ until
$n_k = 64$. Then,

    [S]^k = [S]^{k-1} + [T]^k                                          (32)

where the resulting parallel array $[R] = [S]^7$ will be such that

    [R]_{1j} = \sum_{i=1}^{128} [F]_{ij} = I_j                         (33)
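The shift-and-add summation can be sketched in Python as follows. The sketch assumes that a north shift by n moves every row up by n places and fills the vacated rows at the bottom with zeros; that edge behaviour, the helper names and the test array are assumptions made for the example rather than a description of the MPP Pascal shift function.

```python
def shift_north(a, n):
    """Move every row up by n places; rows entering at the bottom are zero."""
    rows, cols = len(a), len(a[0])
    return [list(a[i + n]) if i + n < rows else [0.0] * cols
            for i in range(rows)]


def column_sums_by_doubling(f):
    """Sum each column with shifts of 1, 2, 4, ... rows; returns (sums, steps)."""
    s = [list(row) for row in f]
    n, steps = 1, 0
    while n < len(s):
        t = shift_north(s, n)
        s = [[u + v for u, v in zip(row_s, row_t)] for row_s, row_t in zip(s, t)]
        n *= 2
        steps += 1
    return s[0], steps      # the first row now holds the column totals


f = [[float(i + 1), 1.0] for i in range(128)]
sums, steps = column_sums_by_doubling(f)
print(sums, steps)
```

For 128 rows the loop runs with shifts of 1, 2, 4, 8, 16, 32 and 64 rows, so seven array additions replace the 127 additions per column of a serial sum.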

The number of arithmetic operations involved in carrying
out the numerical integration of 128 functions by the above
procedure is much smaller than the number of arithmetic
operations involved in the serial computation. For example, the
evaluation of [H] on the MPP requires three multiplication
operations compared to 49152 multiplications on serial computers.

Similarly, in the summation in equation (28) for 128 functions,
one needs only 7 addition operations compared to the 16384
additions required on serial computers. Thus numerical
integration of large numbers of functions will be much faster on
the MPP than on conventional serial processors. When one
uses smaller numbers of zeros of Laguerre polynomials in Eq. (28)
one can increase the number j of functions to be integrated such
that all the processors in the array are properly utilized. We
hope to benchmark the performance of the MPP Pascal program
following the above procedure in the future.


D. CONCLUSION 


In the past year we have made progress in using the MPP
efficiently for electronic structure calculations of atomic and
solid state systems. The MPP is a different computing environment
from the conventional serial computers used before, and one
has to design parallel algorithms and rewrite computer
codes to convert the existing FORTRAN codes used for electronic
structure calculations. The non-availability of a scientific
subroutine library in the MPP Pascal language makes the conversion
more difficult and time consuming. Thus we will devote further
effort in the coming years to writing suitable MPP Pascal
subprograms for many of the standard numerical procedures that
are required for our investigations.